Continuity of the Value of Competitive Markov Decision Processes

Author

  • Eilon Solan
Abstract

A Markov Decision Process (MDP) is given by (i) a finite set of states S and an initial state s₁ ∈ S, (ii) a finite set of actions A, (iii) a cost function c: S×A → ℝ, and (iv) a transition rule p: S×A → Δ(S), where Δ(S) is the space of probability distributions over S. At every stage n ∈ ℕ, where ℕ is the set of positive integers, the process is in some state sₙ ∈ S. The decision maker chooses an action aₙ ∈ A, and a new state sₙ₊₁ ∈ S is chosen according to p(· | sₙ, aₙ). It is assumed that the decision maker remembers the sequence of states the process visited and his past actions. Denote by H = ∪ₙ∈ℕ (S×A)ⁿ⁻¹ × S the set of all finite histories, where by convention, B⁰ = {∅} for every finite set B and we identify {∅}×S with S. A plan of the decision maker is a function σ which assigns to every finite...
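The model in the abstract can be sketched in a few lines of code. This is a minimal illustration under assumed toy data, not the paper's construction: the state space, action set, cost function, and transition rule below are all made up for the example, and a "plan" is represented as any function from a finite history to an action.

```python
import random

# Illustrative finite MDP (all names and values are assumptions for this sketch).
S = ["s1", "s2"]                                  # finite state space S
A = ["a", "b"]                                    # finite action set A
c = {(s, a): 1.0 for s in S for a in A}           # cost function c: S×A → R
p = {(s, a): {"s1": 0.5, "s2": 0.5}               # transition rule p: S×A → Δ(S)
     for s in S for a in A}

def simulate(plan, s1, n_stages, rng=random.Random(0)):
    """Run the process for n_stages from initial state s1.

    A plan maps each finite history (s1, a1, s2, ..., sn) to an action,
    mirroring the assumption that the decision maker recalls all past
    states and actions.
    """
    history = [s1]                                # histories alternate states and actions
    costs = []
    for _ in range(n_stages):
        s = history[-1]                           # current state sn
        a = plan(tuple(history))                  # action chosen from the full history
        costs.append(c[(s, a)])
        dist = p[(s, a)]                          # distribution p(· | sn, an)
        nxt = rng.choices(list(dist), weights=list(dist.values()))[0]
        history += [a, nxt]                       # extend the finite history
    return history, costs

# A stationary plan: ignores the history except (implicitly) the current state.
hist, costs = simulate(lambda h: "a", "s1", n_stages=3)
```

A history of length n here contains n+1 states and n actions, matching the identification of {∅}×S with S for the empty history.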


Similar resources

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...


Optimal Control of Piecewise Deterministic Markov Processes with Finite Time Horizon

In this paper we study controlled Piecewise Deterministic Markov Processes with finite time horizon and unbounded rewards. Using an embedding procedure we reduce these problems to discrete-time Markov Decision Processes. Under some continuity and compactness conditions we establish the existence of an optimal policy and show that the value function is the unique solution of the Bellman equation...


Countable State Markov Decision Processes with Unbounded Jump Rates and Discounted Cost: Optimality Equation and Approximations

This paper considers Markov decision processes (MDPs) with unbounded rates, as a function of state. We are especially interested in studying structural properties of optimal policies and the value function. A common method to derive such properties is by value iteration applied to the uniformised MDP. However, due to the unboundedness of the rates, uniformisation is not possible, and so value i...


Conditional Value-at-Risk Minimization in Finite State Markov Decision Processes: Continuity and Compactness

This study is concerned with the dynamic risk-analysis for finite state Markov decision processes. As a measure of risk, we consider conditional value-at-risk (CVaR) for the real value of the discounted total reward from a policy, under whose criterion risk optimal or deterministic policies are defined. The risk problem is equivalently redefined as a non-linear optimization problem on the attain...


On terminating Markov decision processes with a risk-averse objective function

We consider a class of undiscounted terminating Markov decision processes with a risk-averse exponential objective function and compact constraint sets. After assuming the existence of an absorbing cost-free terminal state, positive transition costs away from it, and continuity of the transition probability and cost functions, we establish (i) the existence of a real-valued optimal cost function...



Publication date: 2003